Data-driven Random Fourier Features using Stein Effect

Authors

  • Wei-Cheng Chang
  • Chun-Liang Li
  • Yiming Yang
  • Barnabás Póczos
Abstract

Large-scale kernel approximation is an important problem in machine learning research. Approaches using random Fourier features have become increasingly popular [Rahimi and Recht, 2007], where kernel approximation is treated as empirical mean estimation via Monte Carlo (MC) or Quasi-Monte Carlo (QMC) integration [Yang et al., 2014]. A limitation of the current approaches is that all features receive equal weights that sum to 1. In this paper, we propose a novel shrinkage estimator based on the "Stein effect", which provides a data-driven weighting strategy for random features and enjoys theoretical justification in terms of lowering the empirical risk. We further present an efficient stochastic algorithm for large-scale applications of the proposed method. Our empirical results on six benchmark data sets demonstrate the advantageous performance of this approach over representative baselines in both kernel approximation and supervised learning tasks.
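
To make the "empirical mean with equal weights" view concrete, the following is a minimal sketch of standard random Fourier features for a Gaussian kernel, in which each of the D features receives weight 1/D. It is not the shrinkage estimator proposed in the paper, whose details the abstract does not give. The function names, the `weights` argument, and the choice of a Gaussian kernel with bandwidth `sigma` are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Exact Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def rff_features(X, W, b):
    # Unweighted random Fourier features sqrt(2) * cos(W x + b).
    return np.sqrt(2.0) * np.cos(X @ W.T + b)

def approx_kernel(X, Y, W, b, weights=None):
    # Weighted sum over features; the default is the uniform 1/D weighting
    # of plain Monte Carlo. `weights` is a hypothetical hook showing where
    # a data-driven weighting could be plugged in.
    D = W.shape[0]
    if weights is None:
        weights = np.full(D, 1.0 / D)
    return (rff_features(X, W, b) * weights) @ rff_features(Y, W, b).T

rng = np.random.default_rng(0)
d, D, sigma = 5, 2000, 1.0
X = rng.normal(size=(50, d))
W = rng.normal(scale=1.0 / sigma, size=(D, d))  # frequencies from the kernel's spectral density
b = rng.uniform(0.0, 2.0 * np.pi, size=D)       # random phases

K_exact = gaussian_kernel(X, X, sigma)
K_approx = approx_kernel(X, X, W, b)
max_err = np.max(np.abs(K_exact - K_approx))
print(f"max absolute error with D={D}: {max_err:.3f}")
```

Passing a non-uniform `weights` vector changes only the final weighted sum, which is the point at which a data-driven weighting strategy would act.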

Similar resources

Data Dependent Kernel Approximation using Pseudo Random Fourier Features

Kernel methods are a powerful and flexible approach to solving many problems in machine learning. Due to the pairwise evaluations in kernel methods, the complexity of kernel computation grows as the data size increases; thus the applicability of kernel methods is limited for large-scale datasets. Random Fourier Features (RFF) has been proposed to scale the kernel method for solving large scale data...

A Data-driven Method for Crowd Simulation using a Holonification Model

In this paper, we present a data-driven method for crowd simulation with a holonification model. With this extra module, the accuracy of the simulation increases and it generates more realistic agent behaviors. First, we show how to use the concept of a holon in crowd simulation and how effective it is. For this reason, we use simple rules for holonification. Using real-world data, we model the...

Detection of high impedance faults in distribution networks using Discrete Fourier Transform

In this paper, a new method for extracting dynamic properties for High Impedance Fault (HIF) detection using the discrete Fourier transform (DFT) is proposed. Unlike conventional methods, which detect a high impedance fault using features extracted from data windows after the fault, the proposed method uses the disturbance detection algorithm in the network, the normalized changes of the selected fea...

The Error Probability of Random Fourier Features is Dimensionality Independent

We show that the error probability of reconstructing kernel matrices from Random Fourier Features for any shift-invariant kernel function is at most O(exp(−D)), where D is the number of random features. We also provide a matching information-theoretic, method-independent lower bound of Ω(exp(−D)) for standard Gaussian distributions. Compared to prior work, we are the first to show that the error ...

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performances. Unlike approaches based on random Fourier features where the basis functions (i.e., cosine and sine functions) are sampled from...

Publication date: 2017